83 research outputs found
Training Machine Translation for Human Acceptability
Discriminative training, a.k.a. tuning, is an important part of Statistical Machine Translation. This step optimises weights for the several statistical models and heuristics used in a machine translation system, in order to balance their relative effect on the translation output. Different weights lead to significant changes in the quality of translation outputs, and thus selecting appropriate weights is of key importance.
This thesis addresses three major problems with current discriminative training methods in order to improve translation quality. First, we design more accurate automatic machine translation evaluation metrics that correlate better with human judgements. An automatic evaluation metric is used in the loss function of most discriminative training methods; however, which metric is best for this purpose remains an open question.
In this thesis we propose two novel evaluation metrics that achieve better correlation with human judgements than the current de facto standard, the BLEU metric. We show that these metrics can improve translation quality when used in discriminative training.
Second, we design an algorithm to select sentence pairs for training the discriminative learner from large pools of freely available parallel sentences.
These resources tend to be noisy and include translations of varying degrees of quality and suitability for the translation task at hand, especially if obtained using crowdsourcing methods. Nevertheless, they are crucial
when professionally created training data is scarce or unavailable. There is very little previous research on data selection for discriminative training. Our novel data selection algorithm requires neither knowledge of the test set nor decoding outputs, and is thus more generally useful and efficient. Our experiments show that with this data selection algorithm, translation quality consistently improves over strong baselines.
Finally, the third component of the thesis is a novel weighted ranking-based optimisation algorithm for discriminative training. In contrast to previous approaches, this technique assigns a different weight to each training
instance according to its reachability and its relationship to the test sentence being decoded, a form of transductive learning. Our experimental results show improvements over a modern state-of-the-art method across different language pairs.
Overall, the proposed approaches lead to better translation quality than strong baselines in our experiments, both in isolation and when combined, and can be easily applied to most existing statistical machine translation approaches.
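The idea of weighting each training instance in a ranking-based objective can be illustrated with a minimal sketch. This is not the thesis's algorithm: the pairwise hinge loss, the function name `weighted_pairwise_hinge_loss`, and the way instance weights are supplied are all assumptions made for illustration. The abstract only states that weights depend on reachability and on the relationship to the test sentence.

```python
def weighted_pairwise_hinge_loss(model_weights, pairs, instance_weights, margin=1.0):
    """Sum of per-pair hinge losses, each scaled by its sentence's weight.

    pairs: list of (sentence_id, better_features, worse_features), where
    "better" is the candidate translation the metric prefers.
    instance_weights: hypothetical mapping sentence_id -> weight.
    """
    total = 0.0
    for sentence_id, better, worse in pairs:
        # Linear model score difference between the two candidates.
        score_diff = sum(
            w * (b - c) for w, b, c in zip(model_weights, better, worse)
        )
        # Penalise pairs that are not ranked correctly by at least `margin`.
        total += instance_weights[sentence_id] * max(0.0, margin - score_diff)
    return total


# Toy example: two sentences, one candidate pair each.
model_weights = [1.0, 0.0]
pairs = [
    (0, [2.0, 0.0], [0.0, 0.0]),  # correctly ranked with margin: no loss
    (1, [0.0, 0.0], [1.0, 0.0]),  # wrongly ranked: hinge value 2.0
]
instance_weights = {0: 1.0, 1: 0.5}
loss = weighted_pairwise_hinge_loss(model_weights, pairs, instance_weights)
```

In this sketch a down-weighted sentence contributes less to the objective, which is one simple way to let some training instances matter more than others.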
Examining Temporalities on Stance Detection Towards COVID-19 Vaccination
Previous studies have highlighted the importance of vaccination as an
effective strategy to control the transmission of the COVID-19 virus. It is
crucial for policymakers to have a comprehensive understanding of the public's
stance towards vaccination on a large scale. However, attitudes towards
COVID-19 vaccination, such as pro-vaccine or vaccine hesitancy, have evolved
over time on social media. Thus, it is necessary to account for possible
temporal shifts when analysing these stances. This study aims to examine the
impact of temporal concept drift on stance detection towards COVID-19
vaccination on Twitter. To this end, we evaluate a range of transformer-based
models using chronological and random splits of social media data. Our findings
demonstrate significant discrepancies in model performance when comparing
random and chronological splits across all monolingual and multilingual
datasets. Chronological splits significantly reduce the accuracy of stance
classification. Therefore, real-world stance detection approaches need to be
further refined to incorporate temporal factors as a key consideration.
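The difference between the two evaluation protocols can be sketched as follows. This is a minimal illustration, not the study's code: the field names `created_at`, `text` and `stance`, the 20% test fraction, and the toy data are assumptions.

```python
import random
from datetime import date


def random_split(items, test_fraction=0.2, seed=0):
    """Shuffle the items, then hold out a fraction of them for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def chronological_split(items, test_fraction=0.2):
    """Sort by timestamp, then hold out the most recent items for testing."""
    ordered = sorted(items, key=lambda item: item["created_at"])
    n_test = round(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Toy data: one hypothetical tweet per month.
tweets = [
    {"created_at": date(2021, month, 1), "text": f"tweet {month}", "stance": "pro"}
    for month in range(1, 11)
]

train, test = chronological_split(tweets)
rand_train, rand_test = random_split(tweets)
```

With the chronological split every test tweet is newer than every training tweet, so the model is evaluated on data from a later period than it was trained on. A random split mixes periods and can hide the effect of temporal drift.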
Similarity-Aware Multimodal Prompt Learning for Fake News Detection
The standard paradigm for fake news detection mainly utilizes text
information to model the truthfulness of news. However, the discourse of online
fake news is typically subtle and it requires expert knowledge to use textual
information to debunk fake news. Recently, studies focusing on multimodal fake
news detection have outperformed text-only methods. Recent approaches utilizing
the pre-trained model to extract unimodal features, or fine-tuning the
pre-trained model directly, have become a new paradigm for detecting fake news.
However, this paradigm either requires a large number of training instances, or
updates the entire set of pre-trained model parameters, making real-world fake
news detection impractical. Furthermore, traditional multimodal methods fuse
the cross-modal features directly without considering that the uncorrelated
semantic representation might inject noise into the multimodal features. This
paper proposes a Similarity-Aware Multimodal Prompt Learning (SAMPLE)
framework. First, we incorporate prompt learning into multimodal fake news
detection. Prompt learning, which only tunes prompts with a frozen language
model, can reduce memory usage significantly and achieve comparable
performances, compared with fine-tuning. We analyse three prompt templates with
a soft verbalizer to detect fake news. In addition, we introduce the
similarity-aware fusing method to adaptively fuse the intensity of multimodal
representation and mitigate the noise injection via uncorrelated cross-modal
features. In evaluation, SAMPLE surpasses the F1 scores and accuracies of
previous works on two benchmark multimodal datasets, demonstrating the
effectiveness of the proposed method in detecting fake news. In addition,
SAMPLE outperforms other approaches in both few-shot and
data-rich settings.
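The intuition behind similarity-aware fusion can be sketched in a few lines: when the text and image representations are weakly correlated, the fused representation is scaled down so that it injects less noise. This is an illustrative sketch only, not SAMPLE's actual formulation. The function names, the mapping of cosine similarity to a gate in [0, 1], and the use of plain concatenation are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_aware_fuse(text_vec, image_vec):
    """Concatenate both modalities and scale by a similarity-derived gate."""
    similarity = cosine_similarity(text_vec, image_vec)
    gate = (similarity + 1.0) / 2.0  # assumed mapping from [-1, 1] to [0, 1]
    fused = [gate * x for x in list(text_vec) + list(image_vec)]
    return fused, gate


# Orthogonal (uncorrelated) features are damped; aligned features pass through.
fused_weak, gate_weak = similarity_aware_fuse([1.0, 0.0], [0.0, 1.0])
fused_strong, gate_strong = similarity_aware_fuse([1.0, 0.0], [1.0, 0.0])
```

A learned or standardised similarity score could replace the fixed gate here; the point of the sketch is only that fusion strength depends on cross-modal agreement.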
VaxxHesitancy: A Dataset for Studying Hesitancy Towards COVID-19 Vaccination on Twitter
Vaccine hesitancy has been a common concern, probably since vaccines were
created and, with the popularisation of social media, people started to express
their concerns about vaccines online alongside those posting pro- and
anti-vaccine content. Predictably, since the first mentions of a COVID-19
vaccine, social media users posted about their fears and concerns or about
their support and belief in the effectiveness of these rapidly developing
vaccines. Identifying and understanding the reasons behind public hesitancy
towards COVID-19 vaccines is important for policy makers that need to develop
actions to better inform the population with the aim of increasing vaccine
take-up. In the case of COVID-19, where the fast development of the vaccines
was mirrored closely by growth in anti-vaxx disinformation, automatic means of
detecting citizen attitudes towards vaccination became necessary. This is an
important computational social sciences task that requires data analysis in
order to gain in-depth understanding of the phenomena at hand. Annotated data
is also necessary for training data-driven models for more nuanced analysis of
attitudes towards vaccination. To this end, we created a new collection of over
3,101 tweets annotated with users' attitudes towards COVID-19 vaccination
(stance). We also develop a domain-specific language model (VaxxBERT)
that achieves the best predictive performance (73.0 accuracy and 69.3 F1-score)
compared to a robust set of baselines. To the best of our knowledge, these
are the first dataset and model that treat vaccine hesitancy as a category
distinct from pro- and anti-vaccine stance.
Comment: Accepted at ICWSM 202
Sheffield systems for the English-Romanian translation task
© 2016 The Authors. Published by Association for Computational Linguistics. This is an open access article available under a Creative Commons licence.
The published version can be accessed at the following link on the publisher’s website: http://dx.doi.org/10.18653/v1/W16-2307
This work was supported by the QT21 (H2020 No.645452) project.
A Large-Scale Comparative Study of Accurate COVID-19 Information versus Misinformation
The COVID-19 pandemic led to an infodemic where an overwhelming amount of
COVID-19 related content was being disseminated at high velocity through social
media. This made it challenging for citizens to differentiate between accurate
and inaccurate information about COVID-19. This motivated us to carry out a
comparative study of the characteristics of COVID-19 misinformation versus
those of accurate COVID-19 information through a large-scale computational
analysis of over 242 million tweets. The study makes comparisons along four
key aspects: 1) the distribution of topics, 2) the live status of tweets, 3)
language analysis and 4) the spreading power over time. An added contribution
of this study is the creation of a COVID-19 misinformation classification
dataset. Finally, we demonstrate that this new dataset helps improve
misinformation classification by more than 9% based on average F1 measure.
Bio-SIEVE: Exploring Instruction Tuning Large Language Models for Systematic Review Automation
Medical systematic reviews can be very costly and resource intensive. We
explore how Large Language Models (LLMs) can support and be trained to perform
literature screening when provided with a detailed set of selection criteria.
Specifically, we instruction-tune LLaMA and Guanaco models to perform abstract
screening for medical systematic reviews. Our best model, Bio-SIEVE,
outperforms both ChatGPT and trained traditional approaches, and generalises
better across medical domains. However, there remains the challenge of adapting
the model to safety-first scenarios. We also explore the impact of multi-task
training with Bio-SIEVE-Multi, including tasks such as PICO extraction and
exclusion reasoning, but find that it is unable to match single-task
Bio-SIEVE's performance. We see Bio-SIEVE as an important step towards
specialising LLMs for the biomedical systematic review process and explore its
future developmental opportunities. We release our models, code and a list of
DOIs to reconstruct our dataset for reproducibility.
Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification
Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning
techniques designed to make the training of language models more efficient.
Previous results demonstrated that these methods can even improve performance
on some classification tasks. This paper complements the existing research by
investigating how these techniques influence the classification performance and
computation costs compared to full fine-tuning when applied to multilingual
text classification tasks (genre, framing, and persuasion techniques detection;
with different input lengths, number of predicted classes and classification
difficulty), some of which have limited training data. In addition, we conduct
in-depth analyses of their efficacy across different training scenarios
(training on the original multilingual data; on the translations into English;
and on a subset of English-only data) and different languages. Our findings
provide valuable insights into the applicability of the parameter-efficient
fine-tuning techniques, particularly to complex multilingual and multilabel
classification tasks.
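To make the efficiency argument concrete: LoRA keeps a weight matrix of shape d_out x d_in frozen and trains only a low-rank update formed from two matrices of shapes d_out x r and r x d_in. The sketch below compares trainable parameter counts for a single layer. The layer size of 768 and rank of 8 are illustrative assumptions, not values reported in the paper.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for a rank-`rank` update (two low-rank factors)."""
    return rank * (d_out + d_in)


# Illustrative sizes only: one 768 x 768 projection with a rank-8 update.
full = full_finetune_params(768, 768)
lora = lora_params(768, 768, rank=8)
reduction = full // lora
```

For these sizes the low-rank update trains 12,288 parameters instead of 589,824, a 48-fold reduction for that layer. Actual savings depend on which layers are adapted and on the chosen rank.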